-
-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Add code to generalize data using different strategies #1830
Conversation
Is the main goal here to verify that the algorithms produce useful results before implementing in osm2pgsql? |
This is part of the generalization effort (https://osm2pgsql.org/generalization/). For some context see
Yes. And to see how performant it is. It doesn't help us if it can't be used in real-world scenarios. |
Interesting reads those blog posts. A couple of remarks though:
|
A few more remarks:
|
You are raising some valid points, @mboeringa. In fact the first version of the code in this PR would generate polygons that were all invalid! (The rings were not closed, which is fixed now.) I also added an optional Unfortunately the potrace library will sometimes generate invalid polygons. I have seen two cases: Sometimes it snips off corners of the tiles it renders, this can be mitigated by having larger margins around tiles, but that's certainly something to look out for. The other happens when rendering polygons touching in single points (think of the figure "8") which must be modelled as two polygons, not one. There is another problem with the potrace library which I couldn't figure out yet: The left side of all polygons is slightly to the left of where it should be, the right side is okay. Maybe some kind of rounding error. You can't usually see it in the results though, because the effect is small, but still somewhat annoying. You also mentioned the |
Having a guarantee of validity is very important since lots of PostGIS functions require validity. |
README-gen.md
Outdated
* `-w, --width=WIDTH`: Size of the imaged rendered when using the `raster-union` | ||
strategy, not used in the `vector-union` strategy. | ||
* `-b, --buffer=BUFFER`: The amount by which the polygons will be buffered. For | ||
the `raster-union` strategy this is in pixels, for the `vector-union` strategy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation doesn't say what resolution is used per tile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sorry, there is a lot the documentation doesn't spell out clearly. That's because this is still work-in-progress. You have to do a bit of experimenting if you want to try this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things I noticed while getting it running. Not a complete review.
- There's a bunch of hard-coded 3857 assumptions. If we're okay with these, we need to document them.
src/osm2pgsql-gen.cpp
Outdated
connection->query( | ||
PGRES_COMMAND_OK, | ||
"PREPARE get_extent AS WITH ex AS " | ||
"(SELECT ST_EstimatedExtent('{}', 'geom') AS e) " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hard-codes the geometry column to geom
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixes in new version
src/osm2pgsql-gen.cpp
Outdated
xmax = 0; | ||
ymax = 0; | ||
} else { | ||
if (x == 0 && y == 0 && xmax == 0 && ymax == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zero shouldn't be treated as a special value, because you might just want to generate the 0/0 tile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in new version
README-gen.md
Outdated
* `-b, --buffer=BUFFER`: The amount by which the polygons will be buffered. For | ||
the `raster-union` strategy this is in pixels, for the `vector-union` strategy | ||
this is in Mercator map units. | ||
* `-g, --group-by-column=COL`: Set this to the column describing the type of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are multiple columns indicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not possible. If that's really something we need, we can add that later. But I am trying to keep the config to a minimum.
New version documents that this only works for 3857. Also added a "make_valid" option to ensure results are valid polygons. Lots of bug fixes. And other stuff... I'll get around to documenting this eventually... |
README-gen.md
Outdated
called `type` with the type of the landcover (forest, grass, etc.). | ||
|
||
Create a table `landcover_z10` with (at least) columns `geom` and `type` as | ||
well as integer columns `x`, and `y`. Then call this: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An index on x, y
is also needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use osm2pgsql-gen
on larger extracts or the planet you need to run it with -D
on first import to suppress DELETE
s, then built indexes on x, y
and when you do updates run without the -D
. The problem is that by default osm2pgsql-gen
will DELETE
all features in a tile in the output table before loading them. But when you start that for the first time, i.e. when the PREPARE
is done, the table is empty, so even if there is an index PostgreSQL will decide not to use it. That's fine in the beginning, but as the table grows everything becomes slower and slow.
After the import you create the index and do an ANALYZE
and from then on everything is fine when doing updates.
This is a bit tricky, but once we put this functionality into osm2pgsql itself, we know whether we are in import or append mode and can do the right thing automatically.
I've been experimenting with the results now, and am having an error where the lower-left corner of each tile's polygon is being clipped away I created the base table with CREATE MATERIALIZED VIEW
osm_planet_trees
AS
SELECT
osm_id,
way,
name,
ref,
way_area
FROM planet_osm_polygon
WHERE "natural" = 'wood' OR landuse = 'forest' then ran The result in qgis looks like this: |
I believe that's already been fixed in the current version of this PR. Did you use that? |
dd1c8f3
to
7891280
Compare
This comment was marked as off-topic.
This comment was marked as off-topic.
e56cb0e
to
f7430aa
Compare
285255d
to
0613222
Compare
This commit adds the code to generalize various types of data using different strategies. The following strategies work on a tile-by-tile basis and operate on polygons: The "vector-union" strategy buffers and unionizes polygons using vector operations. The "raster-union" strategy does a similar thing but does it in raster space which is much faster. First the polygons are rendered into a raster, an open/close operation is called (which basically does the same thing as the buffering in vector space) and finally the resulting raster is vectorized again. The "builtup" strategy is intended to derive a layer of builtup areas from landuse=residential/industrial etc. as well as building cover and dense road networks. This still needs some work... Also a new "discrete-isolation" strategy which rates places based on some importance metric. (This is not tile-based.) The new "rivers" strategy finds important rivers, this is still very much work in progress. For the raster support this adds two new library dependency: CImg and potrace. The functionality is accessed through a new command line program called osm2pgsql-gen. It reads the same Lua config file that osm2pgsql reads. Call it with -h to get some usage information. This program is for testing only, eventually the functionality should be accessible from osm2pgsql itself. See also https://osm2pgsql.org/generalization/ .
Closing in favour of #1942. |
This commit adds the code to generalize polygons using two different strategies. In both cases polygons are merged and simplified on a tile-by-tile basis.
The "vector" strategy uses the commonly used approach where polygons are buffered, their union is calculated and then a reverse buffer applied, followed by a final buffer again.
The "raster" strategy does a similar thing but does it in raster space which is much faster. First the polygons are rendered into a raster, an open/close operation is called (which basically does the same thing as the buffering in vector space) and finally the resulting raster is vectorized again.
For the raster support this adds two new library dependency: CImg and potrace.
The functionality is accessed through a new command line program called osm2pgsql-gen. Call it with -h to get some usage information. This program is for testing only, eventually the functionality should be accessible from osm2pgsql itself (using the Lua config file for configuration).
Some additional information is in README-gen.md